Abstract. Artifact-centric business processes have recently emerged as an approach
in which processes are centred around the evolution of business entities, called
artifacts, giving equal importance to control-flow and data. The recent
Guard-Stage-Milestone (GSM) approach provides means for specifying business
artifact lifecycles in a declarative manner, using constructs that match how
executive-level stakeholders think about their business. However, it turns out
that formal verification of GSM is undecidable even for very simple propositional
temporal properties. We attack this challenging problem by translating GSM into a
well-studied formal framework. We exploit this translation to isolate an interesting
class of "state-bounded" GSM models for which verification of sophisticated
temporal properties is decidable. We then introduce some guidelines to turn an
arbitrary GSM model into a state-bounded, verifiable model.
Keywords: artifact-centric systems, guard-stage-milestone, formal verification 
1 Introduction



In the last decade, a plethora of graphical notations (such as BPMN and EPCs) have been
proposed to capture business processes. Independently from the specific notation at hand,
formal verification has been generally considered as a fundamental tool in the process
design phase, supporting the modeler in building correct and trustworthy process models
[16]. Intuitively, formal verification amounts to checking whether possible executions of the
business process model satisfy some desired properties, like generic correctness criteria
(such as deadlock freedom or executability of activities) or domain-dependent constraints.
To enable formal verification and other forms of reasoning support, the business process
language gets translated into a corresponding formal representation, which typically
relies on variants of Petri nets [1], transition systems [2], or process algebras [18].
Properties are then formalized using temporal logics, and model checking techniques
are used to actually carry out verification tasks [8].

A common drawback of classical process modeling approaches is being activity- 
centric: they mainly focus on the control-flow perspective, lacking the connection 
between the process and the data manipulated during its executions. This reflects also 
in the corresponding verification techniques, which often abstract away from the data 
component. This "data and process engineering divide" affects many contemporary 
process-aware information systems, increasing the amount of redundancies and
potential errors in the development phase [12]. To tackle this problem, the artifact-centric
paradigm has recently emerged as an approach in which processes are guided by the
evolution of business data objects, called artifacts [17, 9]. A key aspect of artifacts is
coupling the representation of data of interest, called information model, with lifecy- 
cle constraints, which specify the acceptable evolutions of the data maintained by the 
information model. On the one hand, new modeling notations are being proposed to 
tackle artifact-centric processes. A notable example is the Guard-Stage-Milestone (GSM)
graphical notation [10], which corresponds to the way executive-level stakeholders
conceptualize their processes [7]. On the other hand, formal foundations of the artifact-centric
paradigm are being investigated in order to capture the relationship between processes 
and data and support formal verification 111151151 . Two important issues arise in this 
setting. First, verification formalisms must go beyond propositional temporal logics, 
and incorporate first-order formulae to express constraints about the evolution of data 
and to query the information model of artifacts. Second, formal verification becomes 
much more difficult than for classical activity-centric approaches, even undecidable in 
the general case. 

In this work, we tackle the problem of automated verification of GSM models. 
First of all, we show that verifying GSM models is indeed a very challenging issue, 
being undecidable in general even for simple propositional reachability properties. We 
then provide a sound and complete encoding of GSM into Data-Centric Dynamic 
Systems (DCDSs), a recently developed formal framework for data- and artifact-centric 
processes [15]. This encoding allows us to reproduce in the GSM context the decidability
and complexity results recently established for DCDSs with bounded information models
(state-bounded DCDSs). These are DCDSs where the number of tuples in each state does not
exceed a given maximum value. This does not mean that the system must contain an overall
bounded amount of data: along a run, infinitely many data values can be encountered and
stored into the information model, provided that they do not accumulate in the same
state. We lift this property to the context of GSM, and show that verification of
state-bounded GSM models is decidable for a powerful temporal logic, namely a variant of
first-order μ-calculus supporting a restricted form of quantification [13]. We then isolate
an interesting class of GSM models for which state-boundedness is guaranteed, and 
introduce guidelines that can be employed to turn any GSM model into a state-bounded, 
verifiable model. 

The rest of the paper is organized as follows. Section 2 gives an overview of GSM 
and provides a first undecidability result. Section 3 introduces DCDSs and presents 
the GSM-DCDS translation. Section 4 introduces "state-bounded" GSM models and 
provides key decidability results. Discussion and conclusion follow. 



2 GSM Modeling of Artifact-Centric Systems

The foundational character of artifact-centric business processes is the combination 
of static properties, i.e., the data of interest, and dynamic properties of a business 
process, i.e., how it evolves. Artifacts, the key business entities of a given domain, are 
characterized by (i) an information model that captures business-relevant data, and (ii) 
a lifecycle model that specifies how the artifact progresses through the business. In
this work, we focus on the Guard-Stage-Milestone (GSM) approach for artifact-centric
modeling, recently proposed by IBM [10]. GSM is a declarative modeling framework
that has been designed with the goal of being executable and, at the same time,
sufficiently high-level to be intuitive to executive-level stakeholders. The GSM information
model uses (possibly nested) attribute/value pairs to capture the domain of interest. 
The key elements of a lifecycle model are stages, milestones and guards. Stages are 
(hierarchical) clusters of activities (tasks), intended to update and extend the data of the 
information model. They are associated to milestones, business operational objectives to 
be achieved when the stage is under execution. Guards control the activation of stages 
and, like milestones, are described in terms of data-aware expressions, called sentries, 
involving events and conditions over the artifact information model. Sentries have the 
form on e if cond, where e is an event and cond is an (OCL-based) condition over data. 
Both parts are optional, supporting pure event-based or condition-based sentries. Tasks 
represent the atomic units of work. Basic tasks are used to update the information model 
of some artifact instance (e.g., by using the data payload associated to an incoming event). 
Other tasks are used to add/remove a nested tuple. A specific create-artifact-instance 
task is instead used to create a new instance of a given artifact type; this is done by means 
of a two-way service call, where the result is used to create a new tuple for the artifact 
instance, assign a new identifier to it, and fill it with the result's payload. Obviously, 
another task exists to remove a given artifact instance. In the following, we use model 
for the intensional level of a specific business process described in GSM, and instance to 
denote a GSM model with specific data for its information model. 

The execution of a business process may involve several instances of artifact types 
described by a GSM model. At any instant, the state of an artifact instance (snapshot) is 
stored in its information model, and is fully characterised by: (i) values of attributes in 
the data model, (ii) status of its stages (open or closed) and (iii) status of its milestones
(achieved or invalidated). Artifact instances may interact with the external world by 
exchanging typed events. In fact, tasks are considered to be performed by an external 
agent, and their corresponding execution is captured with two event types: a service call, 
whose instances are populated with data from the information model and then sent to the
environment; and a service call return, whose instances represent the corresponding 
answer from the environment and are used to incorporate the obtained result back into 
the artifact information model. The environment can also send unsolicited (one-way) 
events, to trigger specific guards or milestones. Additionally, any change of a status 
attribute, such as opening a stage or achieving a milestone, triggers an internal event, 
which can be further used to govern the artifact lifecycle. 

Example 1. Figure 1 shows a simple order management process modeled in GSM. The process
centers around an order artifact, whose information model is characterized by a set of status 
attributes (tracking the status of stages and milestones), and by an extendible set of ordered items, 
each constituted by a code and a quantity. The order lifecycle contains three top-level atomic stages 
(rounded rectangles), respectively used to manage the manipulation of the order, its payment, and 
the delivery of a payment receipt. The order management stage contains a task (rectangle) to 
add items to the order. It opens every time an itemRequest event is received, provided that the 
order has not yet been paid. This is represented using a logical condition associated to a guard 
(diamond). The stage closes when the task is executed, by achieving an "item added" milestone 
(circle). A payment can be executed once a payRequest event is issued, provided that the order
contains at least one item (verified by the OCL condition order.items->exists). As soon as
the order is paid, and the corresponding milestone achieved, the receipt delivery stage is opened.
This direct dependency is represented using a dashed arrow, which is a shortcut for the condition
on Order paid, representing the internal event of achieving the "Order paid" milestone.

[Figure 1. Guards: "on itemRequest if not Order paid"; "on payRequest if order.items->exists".
Task: execute payment. Milestones: Order paid, Receipt sent. Information model: status
attributes; items (code, qty).]

Fig. 1: GSM model of a simple order management process
2.1 Operational semantics of GSM 

GSM is associated to three well-defined, equivalent execution semantics, which
discipline the actual enactment of a GSM model [10]. Among these, the GSM incremental
semantics is based on a form of Event-Condition-Action (ECA) rules, called
Prerequisite-Antecedent-Consequent (PAC) rules, and is centered around the notion of GSM
Business steps (B-steps). An artifact instance remains idle until it receives an incoming
event from the environment. It is assumed that such events arrive in a sequence and get
processed by artifact instances one at a time. A B-step then describes what happens to an
artifact snapshot Σ when a single incoming event e is incorporated into it, i.e., how it
evolves into a new snapshot Σ' (see Figure 5 in [10]). Σ' is constructed by building a
sequence of pre-snapshots Σ_i, where Σ_1 results from incorporating e into Σ by updating its
attributes, one at a time, according to the event payload (i.e., its carried data). Each
subsequent pre-snapshot Σ_i is obtained by applying one of the PAC rules to the previous
pre-snapshot Σ_{i-1}. Each such transition is called a micro-step. During a micro-step,
some outgoing events directed to the environment may be generated. When no more PAC
rules can be applied, the last pre-snapshot is returned as Σ', and the entire set of
generated events is sent to the environment.

Each PAC rule is associated to one or more GSM constructs (e.g., stage, milestone)
and has three components:

- Prerequisite: this component refers to the initial snapshot Σ and determines if a
rule is relevant to the current B-step processing an incoming event e.

- Antecedent: this part refers to the current pre-snapshot Σ_i and determines whether
the rule is eligible for execution, or executable, at the next micro-step.

- Consequent: this part describes the effect of firing a rule, which can be
nondeterministically chosen among the executable ones to obtain the next pre-snapshot Σ_{i+1}.

Due to nondeterminism in the choice of the next firing rule, different orderings among 
the PAC rules can exist, leading to non-intuitive outcomes. This is avoided in the 
GSM operational semantics by using an approach reminiscent of stratification in logic 
programming. In particular, the approach (i) exploits implicit dependencies between the 
(structure of) PAC rules to fix an ordering on their execution, and (ii) applies the rules 
according to such ordering [10]. To guarantee B-step executability, avoiding situations
in which the execution indefinitely loops without reaching a stable state, the GSM
incremental semantics implements a so-called toggle-once principle. This guarantees
that a sequence of micro-steps, triggered by an incoming event, is always finite, by
ensuring that each status attribute can change its value at most once during a B-step.
This requirement is implemented by an additional condition in the prerequisite part of
each PAC rule, which prevents it from firing twice.
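
The shape of a B-step can be illustrated with a small Python sketch. It is a simplification under explicit assumptions, not the GSM semantics of [10]: the names PacRule and b_step are hypothetical, rules are assumed to be given already in their firing order, each rule toggles a single boolean status attribute, and the toggle-once principle is enforced through a set of already-changed attributes rather than inside the prerequisite.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Snapshot = Dict[str, object]


@dataclass
class PacRule:
    attr: str                                       # status attribute toggled by the rule
    value: bool                                     # value written by the consequent
    prerequisite: Callable[[Snapshot, str], bool]   # checked on the initial snapshot
    antecedent: Callable[[Snapshot], bool]          # checked on the current pre-snapshot


def b_step(snapshot: Snapshot, event: str, payload: Snapshot,
           rules: List[PacRule]) -> Tuple[Snapshot, List[Tuple[str, bool]]]:
    """Process one incoming event; `rules` is assumed to be listed in firing order."""
    current = dict(snapshot)
    current.update(payload)                  # first pre-snapshot: incorporate the payload
    relevant = [r for r in rules if r.prerequisite(snapshot, event)]
    toggled = set()                          # status attributes already changed in this B-step
    micro_steps: List[Tuple[str, bool]] = []
    fired = True
    while fired:
        fired = False
        for rule in relevant:
            if rule.attr in toggled:         # toggle-once: at most one change per attribute
                continue
            if current.get(rule.attr) != rule.value and rule.antecedent(current):
                current[rule.attr] = rule.value
                toggled.add(rule.attr)
                micro_steps.append((rule.attr, rule.value))
                fired = True
                break                        # restart from the first rule in the ordering
    return current, micro_steps


# Two rules that would flip "x" forever without the toggle-once principle.
rules = [
    PacRule("x", True, lambda s, e: e == "ping", lambda s: True),
    PacRule("x", False, lambda s, e: e == "ping", lambda s: bool(s["x"])),
]
final, steps = b_step({"x": False}, "ping", {}, rules)
print(final, steps)
```

Since every micro-step adds a new attribute to the toggled set, the loop performs at most as many micro-steps as there are status attributes, which is the finiteness argument given above.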

The evolution of a GSM system composed by several artifacts can be described 
by defining the initial state (initial snapshot of all artifact instances) and the sequence 
of event instances generated by the environment, each of which triggers a particular 
B-step, producing a sequence of system snapshots. This perspective intuitively leads 
to the representation of a GSM model as an infinite-state transition system, depicting
all possible sequences of snapshots supported by the model. The initial configuration 
of the information model represents the initial state of this transition system, and the 
incremental semantics provides the actual transition relation. The source of infinity lies
in the payload of incoming events, used to populate the information model of artifacts 
with fresh values (taken from an infinite/arbitrary domain). Since such events are not 
under the control of the GSM model, the system must be prepared to process such 
events in every possible order, and with every acceptable configuration for the values 
carried in the payload. The analogy to transition systems opens the possibility of using 
a formal language, e.g., a (first-order variant of) temporal logic, to verify whether the 
GSM system satisfies certain desired properties and requirements. For example, one 
could test generic correctness properties, such as checking whether each milestone can 
be achieved (and each stage will be opened) in at least one of the possible system
executions, or that whenever a stage is opened, it will always be possible to eventually
achieve one of its milestones. Furthermore, the modeler could also be interested in
verifying domain-specific properties, such as checking whether for the GSM model in
Figure 1 it is possible to obtain a receipt before the payment is processed.

2.2 Undecidability in GSM 

In this section, we show that verifying the infinite-state transition system representing 
the execution semantics of a given GSM model is an extremely challenging problem, 
undecidable even for a very simple propositional reachability property. 

Theorem 1. There exists a GSM model for which verification of a propositional reacha- 
bility property is undecidable. 

Proof. To show undecidability of verification, we illustrate that a Turing machine can
be easily captured in GSM, and that the halting problem can be stated in terms of
a verification problem. In particular, we consider a deterministic, single-tape Turing
machine M = (Q, Σ, q_0, δ, q_f, ⊔), where Q is a finite set of (internal) states, Σ =
{0, 1, ⊔} is the tape alphabet (with ⊔ the blank symbol), q_0 ∈ Q and q_f ∈ Q are the
initial and final state, and δ ⊆ (Q \ {q_f}) × Σ × Q × Σ × {L, R} is a transition relation.
We assume, wlog, that δ consists of k right-shift transitions R_1, ..., R_k (those having
R as last component), and n left-shift transitions L_1, ..., L_n (those having L as last
component). The idea of the translation into a GSM model is the following. Besides status
attributes, the GSM information model is constituted by: (i) a curState slot containing
the current internal state q ∈ Q; (ii) a curCell slot pointing to the cell where the head
of M is currently located; (iii) a collection of cells representing the current state of
the tape. Each cell is a complex nested record constituted by a value v ∈ Σ, and two
pointers prev and next used to link the cell to the previous and next cells. In this way,
the tape is modeled as a linked list, which initially contains a single, blank cell, and
which is dynamically extended as needed. To mark the initial (resp., last) cell of the tape,
we assume that its prev (resp., next) cell is null.

Fig. 2: GSM model of a Turing machine

On top of this information model, a GSM lifecycle that mimics M is shown in
Figure 2, where, due to space constraints, only the right-shift transitions are depicted (the
left-shift ones are symmetric). The schema consists of two top-level stages. The Init stage is
used to initialize the tape. The Transition stage is instead used to mimic the execution of one
of the transitions in δ. Each transition is decomposed into two sub-stages: state update
and head shift. The state update is modeled by one among k + n atomic sub-stages,
each handling the update that corresponds to one of the transitions in δ. These stages are
mutually exclusive, M being deterministic. Consider for example a right-shift transition
R_i = δ(q_{R_i}, v_{R_i}, q'_{R_i}, v'_{R_i}, R) (the treatment is similar for a left-shift
transition). The corresponding state update stage is opened whenever the current state is
q_{R_i}, and the value contained in the cell pointed to by the head is v_{R_i} (this can be
extracted from the information model using the query curCell.value). The incoming arrows
from the two guards of the parent ensure that this condition is evaluated as soon as the
parent stage is opened; hence, if the condition is true, the state update stage is
immediately executed.
When the state update stage is closed, the achievement of the corresponding milestone
triggers one of the guards of the Right shift stage that handles the head shift. It contains
two sub-stages: the first one extends the tape if the head is currently pointing to the last
cell, while the second one just performs the shifting. Whenever a right or left shift stage
achieves the corresponding milestone, the parent transition stage is also closed,
achieving milestone "Transition done". This has the effect of re-opening the transition
stage again, so as to evaluate the next transition to be executed. An alternative way of
immediately closing the transition stage occurs when the current state corresponds to the
final state q_f. In this case, milestone "Halt" is achieved, and the execution terminates
(no further guards are triggered).

By considering this construction, the halting problem for M can be rephrased as the
following verification problem: given the GSM model encoding M, and starting from
an initial state where the information model is empty, is it possible to reach a state where
the "Halt" milestone is achieved? Notice that, since M is deterministic, the B-steps of
the corresponding GSM model constitute a linear computation, which could eventually
reach the "Halt" milestone or continue indefinitely. Therefore, reaching a state where
"Halt" is achieved can be equivalently formulated using propositional CTL or LTL. □

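The information model used in the proof of Theorem 1 (curState, curCell, and a linked list of cells with prev and next pointers) can be made concrete with a short Python sketch. It only illustrates the data structure and the two sub-stages of a transition (state update, head shift); the names Cell and reaches_halt, the string "_" for the blank symbol, and the step budget max_steps are assumptions of this example. The budget is needed precisely because halting is undecidable: a False answer only means that "Halt" was not observed within the budget.

```python
from typing import Dict, Optional, Tuple

BLANK = "_"  # stands for the blank symbol


class Cell:
    """A tape cell with prev/next pointers, as in the information model above."""
    def __init__(self, prev: Optional["Cell"] = None, next: Optional["Cell"] = None):
        self.value = BLANK
        self.prev = prev
        self.next = next


# (state, read symbol) -> (new state, written symbol, "L" or "R")
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def reaches_halt(delta: Delta, q0: str, qf: str, max_steps: int) -> bool:
    """Return True if the final state is reached within max_steps transitions."""
    cur_state, cur_cell = q0, Cell()         # Init stage: a single blank cell
    for _ in range(max_steps):
        if cur_state == qf:                  # "Halt" milestone achieved
            return True
        key = (cur_state, cur_cell.value)
        if key not in delta:                 # no state update stage can open
            return False
        cur_state, cur_cell.value, move = delta[key]   # state update
        if move == "R":                      # head shift, extending the tape if needed
            if cur_cell.next is None:
                cur_cell.next = Cell(prev=cur_cell)
            cur_cell = cur_cell.next
        else:
            if cur_cell.prev is None:
                cur_cell.prev = Cell(next=cur_cell)
            cur_cell = cur_cell.prev
    return cur_state == qf


writer = {("a", BLANK): ("b", "1", "R"), ("b", BLANK): ("h", "1", "L")}
runner = {("a", BLANK): ("a", BLANK, "R")}   # moves right forever
print(reaches_halt(writer, "a", "h", 10))    # True
print(reaches_halt(runner, "a", "h", 100))   # False: not halted within the budget
```

The second machine shows why no bound on the number of cells can be fixed in advance: the tape grows by one cell at every step.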
3 Translation into Data-Centric Dynamic Systems

We discuss a translation procedure that faithfully rewrites a GSM model into a corre- 
sponding formal representation in terms of a Data-Centric Dynamic System (DCDS), 
for which interesting decidability results have been recently obtained. 

DCDSs are a formal framework for the specification of data-aware business processes,
i.e., systems where the connection between the process perspective and the manipulated
data is explicitly tackled [3]. Technically, a DCDS is a pair 𝒮 = ⟨𝒟, 𝒫⟩, where 𝒟
is a data layer and 𝒫 is a process layer over 𝒟. 𝒟 maintains all the relevant data
in the form of a relational database with integrity constraints. In the artifact-centric
context, the database is constituted by the union of all artifact information models. The
process layer 𝒫 changes and evolves the data maintained by 𝒟. It is constituted by a
tuple 𝒫 = ⟨ℱ, 𝒜, ρ⟩. ℱ is a finite set of functions representing interfaces to external
services, used to import new, fresh data into the system. 𝒜 is a set of actions of the form
α(p_1, ..., p_n) : {e_1, ..., e_m}, where α is the action name, p_1, ..., p_n are input
parameters, and e_1, ..., e_m are effect specifications. Each effect specification defines how
a portion of the next database instance is constructed starting from the current one.
Technically, its form is Q ⇝ E, where: (i) Q is a query over 𝒟 that could involve action
parameters, and is meant to extract tuples from the current database; (ii) E is a set of
effects, specified in terms of facts over 𝒟 that will be asserted in the next state; these
facts can contain variables of Q (which are then replaced with actual values extracted from
the current database), and also service calls, which are resolved by calling the service with
actual input parameters and substituting them with the obtained result.¹ Finally, ρ is a
declarative process specified in terms of Condition-Action (CA) rules that determine, at any
moment, which actions are executable. Technically, each CA rule has the form Q ↦ α, where Q
is a query over 𝒟, and α is an action. Whenever Q has a positive answer over the current
database, then α becomes executable, with actual values for its parameters given by the
answer to Q.

¹ In [3], two semantics for services are introduced: deterministic and nondeterministic. Here we
always assume nondeterministic services, which is in line with GSM.

The execution semantics of a DCDS 𝒮 is defined by a possibly infinite-state transition
system Υ_𝒮, where states are instances of the database schema in 𝒟 and each transition
corresponds to the application of an executable action in 𝒫. Similarly to GSM, where the
source of infinity comes from the fact that incoming events carry an arbitrary payload, in
DCDSs the source of infinity lies in the service calls, which can inject arbitrary fresh
values into the system.
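
To make the roles of effect specifications, service calls and CA rules more tangible, the following Python sketch executes a toy process layer over a database represented as a set of facts. It is a minimal illustration under simplifying assumptions, not the formal DCDS semantics of [3]: action parameters and integrity constraints are omitted, and all names (apply_action, executable, new_code, the relations Item and Paid) are hypothetical.

```python
import itertools
from typing import Callable, FrozenSet, Iterable, List, Tuple

Fact = Tuple[object, ...]          # e.g. ("Item", "c1")
Database = FrozenSet[Fact]
Service = Callable[[], object]
# An effect specification: a query returning answers, and a function turning
# each answer into the facts asserted in the next state.
Effect = Tuple[Callable[[Database], Iterable[Fact]],
               Callable[[Fact, Service], Iterable[Fact]]]


def apply_action(db: Database, effects: List[Effect], service: Service) -> Database:
    """Build the next database instance from the effects only."""
    next_db = set()
    for query, assert_facts in effects:
        for answer in query(db):
            next_db.update(assert_facts(answer, service))
    return frozenset(next_db)


def executable(db: Database) -> bool:
    """Condition of a hypothetical CA rule: the action is enabled while no Paid fact exists."""
    return ("Paid",) not in db


_counter = itertools.count(1)


def new_code() -> str:
    return f"c{next(_counter)}"    # stands in for a service returning fresh values


add_item: List[Effect] = [
    # keep the items already stored (facts that are not copied are lost)
    (lambda db: [f for f in db if f[0] == "Item"], lambda f, svc: [f]),
    # assert one new item whose code is obtained through a service call
    (lambda db: [()], lambda _, svc: [("Item", svc())]),
]

db: Database = frozenset()
for _ in range(3):
    if executable(db):
        db = apply_action(db, add_item, new_code)
print(sorted(db))
```

Note that the next database is assembled from the asserted facts alone, which is why the first effect explicitly copies the existing items into the next state.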

We recall some key (un)decidability and complexity results related to DCDSs, which 
will be then used to study the formal verification of GSM. 

Theorem 2 ([3]). There exists a DCDS for which verification of a propositional safety
property expressible in LTL ∩ CTL is undecidable.

This result comes from the high expressiveness of DCDSs. In fact, we will see that
DCDSs can encode GSM. However, alongside this undecidability result, the same work
identifies an interesting class of state-bounded DCDSs, for which decidability of
verification holds for a sophisticated (first-order) temporal logic called μL_P. Intuitively,
state-boundedness requires the existence of an overall bound that limits, at every point in
time, the size of the database instance of 𝒮 (without posing any restriction on which values
can appear in the database). Equivalently, the size of each state contained in Υ_𝒮 cannot
exceed the pre-established bound. Hence, in the following we will interchangeably talk about
state-bounded DCDSs or state-bounded transition systems.
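
The difference between bounding each state and bounding the overall amount of data can be seen on two artificial runs. The Python sketch below is only an illustration: the helper names and the Item relation are assumptions, and inspecting a finite prefix of a run can of course never establish state-boundedness, which is a property of the whole transition system.

```python
from typing import FrozenSet, List, Tuple

Fact = Tuple[str, int]
State = FrozenSet[Fact]
N = 200  # length of the inspected run prefix


def max_state_size(run: List[State]) -> int:
    """Largest number of facts stored in a single state of the run."""
    return max(len(state) for state in run)


def values_seen(run: List[State]) -> int:
    """Number of distinct values encountered along the whole run."""
    return len({value for state in run for (_, value) in state})


# Each state stores one item, replaced by a fresh one at every step.
replacing = [frozenset({("Item", i)}) for i in range(N)]
# Each state keeps all the items received so far.
accumulating = [frozenset(("Item", j) for j in range(i + 1)) for i in range(N)]

print(max_state_size(replacing), values_seen(replacing))        # 1 200
print(max_state_size(accumulating), values_seen(accumulating))  # 200 200
```

Both runs encounter the same number of fresh values, but only the first keeps every state within a fixed bound, matching the intuition that values may appear without limit as long as they do not accumulate in the same state.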

Theorem 3 ([3]). Verification of μL_P properties over state-bounded DCDSs is decidable,
and can be reduced to finite-state model checking of propositional μ-calculus.

μL_P is a first-order variant of μ-calculus, a rich branching-time temporal logic that
subsumes all well-known temporal logics such as PDL, CTL, LTL and CTL* [13].
μL_P employs first-order formulae to query data maintained by the DCDS data layer,
and supports a controlled form of first-order quantification across states (within and
across runs). In particular, μL_P requires that the values in the scope of quantification
continuously persist for the quantification to take effect. As soon as a value is not present
in the current database anymore, a formula talking about it collapses to true or false.
This restriction is in line with the artifact-centric setting, where a given artifact identifier
points to the same artifact as long as that artifact is live, but as soon as the artifact is
destroyed, the identifier can be recycled to identify a completely different artifact (and it
would be incorrect to consider it the same as before).

Example 2. μL_P can express two variants of a correctness requirement for GSM:

- it is always true that, whenever an artifact id is present in the information model, the corre- 
sponding artifact will be destroyed (i.e., the id will disappear) or reach a state where all its 
stages are closed; 

- it is always true that, whenever an artifact id is present in the information model, the corre- 
sponding artifact will persist until a state is reached where all its stages are closed. 
3.1 Translating GSM into DCDS

For the sake of space, we only discuss the intuition behind the translation and provide
the main results. For a full technical development, we refer the interested reader to a
technical report [19].



As introduced in Section 2.1, the execution of a GSM instance is described by a
sequence of B-steps. Each B-step consists of an initial micro-step which incorporates
the incoming event into the current snapshot, a sequence of micro-steps executing all
applicable PAC rules, and finally a micro-step sending the set of generated events at the
termination of the B-step. The translation relies on the incremental semantics: given a GSM
model 𝒢, we encode each possible micro-step as a separate condition-action rule in the
process of a corresponding DCDS 𝒮, such that the effect of the action on the data and
process layers coincides with the effect of the corresponding micro-step in GSM. However,
in order to guarantee that the transition system induced by the resulting DCDS mimics the
one of the GSM model, the translation procedure should also ensure that all semantic
requirements described in Section 2.1 are modeled properly: (i) the "one-message-at-a-time"
and "toggle-once" principles, (ii) the finiteness of micro-steps within a B-step, and
(iii) their order imposed by the model. We meet these requirements by introducing into
the data layer of 𝒮 a set of auxiliary relations, suitably recalling them in the CA rules to
reconstruct the desired behaviour.

Restricting 𝒮 to process only one incoming message at a time is implemented
by the introduction of a blocking mechanism, represented by an auxiliary relation
R_block(id_R, blocked) for each artifact in the system, where id_R is the artifact instance
identifier, and blocked is a boolean flag. This flag is set to true upon receiving an
incoming message, and is then reset to false at the termination of the corresponding
B-step, once the outgoing events accumulated in the B-step are sent to the environment. If
an artifact instance has blocked = true, no further incoming event will be processed.
This is enforced by checking the flag in the condition of each CA rule associated to the
artifact.

In order to ensure the "toggle-once" principle and guarantee the finiteness of the sequence
of micro-steps triggered by an incoming event, we introduce an eligibility tracking
mechanism. This mechanism is represented by an auxiliary relation R_exec(id_R, x_1, ..., x_c),
where c is the total number of PAC rules, and each x_i corresponds to a certain PAC rule
of the GSM model. Each x_i encodes whether the corresponding PAC rule is eligible
to fire at a given moment in time (i.e., a particular micro-step). The initial setup of the
eligibility tracking flags is performed at the beginning of a B-step, based on the
evaluation of the prerequisite condition of each PAC rule. More specifically, when x_i = 0,
the corresponding CA rule is eligible to apply and has not yet been considered for
application. When instead x_i = 1, then either the rule has been fired, or its prerequisite
turned out to be false. This flag-based approach is used to propagate in a compact way
information related to the PAC rules that have been already processed, following a
mechanism that resembles dead path elimination in BPEL. In fact, R_exec is also used to
enforce a firing order of CA rules that follows the one induced by 𝒢. This is achieved
as follows. For each CA rule Q ↦ α corresponding to a given PAC rule r, condition
Q is put in conjunction with a further formula, used to check whether all the PAC rules
that precede r according to the ordering imposed by 𝒢 have been already processed.
Only in this case can r be considered for application, consequently applying its effect α
to the current artifact snapshot. More specifically, the corresponding CA rule becomes
Q ∧ exec(r) ↦ α, where exec(r) = ⋀_i x_i, such that i ranges over the indexes of those
rules that precede r.

Fig. 3: CA-rule encoding a milestone invalidation upon stage activation

Fig. 4: Construction of the B-step transition system Υ_𝒢 and unblocked-state transition
system Υ_𝒮 for a GSM model 𝒢 with initial snapshot S_0 and the corresponding DCDS 𝒮

Once all x_i flags are switched to 1, the B-step is about to finish: a dedicated CA rule
is enabled to send the outgoing events to the environment, and the artifact instance
blocked flag is released.
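
The interplay between the blocking flag and the eligibility flags can be sketched in Python as follows. This is an illustrative approximation of the bookkeeping only, not the translation of [19]: the class AuxState and the functions below are hypothetical names, the effects of the rules on the artifact data are left out, and the list of rules preceding each rule is assumed to be given.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class AuxState:
    """Auxiliary data of one artifact instance (mirrors R_block and R_exec)."""
    blocked: bool = False
    x: List[int] = field(default_factory=list)


def start_b_step(aux: AuxState, prerequisites: Sequence[bool]) -> None:
    aux.blocked = True                                  # stop accepting new events
    aux.x = [0 if ok else 1 for ok in prerequisites]    # 1 = nothing left to do


def can_fire(aux: AuxState, k: int, preceding: Sequence[int]) -> bool:
    exec_k = all(aux.x[i] == 1 for i in preceding)      # exec(k)
    return aux.blocked and aux.x[k] == 0 and exec_k


def mark_processed(aux: AuxState, k: int) -> None:
    aux.x[k] = 1                                        # rule k cannot fire again


def try_finish(aux: AuxState) -> bool:
    if aux.blocked and all(flag == 1 for flag in aux.x):
        aux.blocked = False                             # B-step over: unblock
        return True
    return False


aux = AuxState()
start_b_step(aux, [True, False, True])   # the prerequisite of rule 1 is false
order = {0: [], 1: [0], 2: [0, 1]}       # rules preceding each rule
print(can_fire(aux, 2, order[2]))        # False: rule 0 is still pending
mark_processed(aux, 0)
print(can_fire(aux, 2, order[2]))        # True
```

A rule whose prerequisite fails is marked as processed from the start, so it never delays the rules that follow it in the ordering.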

Example 3. An example of a translation of a GSM PAC rule (indexed by k) is presented in
Figure 3. For simplicity, multiple parameters are compacted using an "array" notation (e.g.,
x_1, ..., x_n is denoted by x). In particular: (1) represents the condition part of a CA rule,
ensuring the "toggle-once" principle (x_k = 0), the compliant firing order (exec(k)) and the
"one-message-at-a-time" principle (R_block(id_R, true)); (2) describes the action signature;
(3) is an effect encoding the invalidation of a milestone once the stage has been activated;
(4) propagates an internal event denoting the milestone invalidation, if needed; (5) flags the
encoded micro-step corresponding to PAC rule k as processed; (6) transports the unaffected
data into the next snapshot.

Given a GSM model 𝒢 with initial snapshot S_0, we denote by Υ_𝒢 its B-step transition
system, i.e., the infinite-state transition system obtained by iteratively applying the
incremental GSM semantics starting from S_0 and nondeterministically considering each
possible incoming event. The states of Υ_𝒢 correspond to stable snapshots of 𝒢, and
each transition corresponds to a B-step. We abstract away from the single micro-steps
constituting a B-step, because they represent temporary intermediate states that are
not interesting for verification purposes. Similarly, given the DCDS 𝒮 obtained from
the translation of 𝒢, we denote by Υ_𝒮 its unblocked-state transition system, obtained
by starting from S_0, and iteratively applying nondeterministically the CA rules of the
process, and the corresponding actions, in all the possible ways. As for states, we only
consider those database instances where no artifact instance is blocked; these
correspond in fact to stable snapshots of 𝒢. We then connect two such states provided
that there is a sequence of (intermediate) states that leads from the first to the second one,
and in each of which at least one artifact instance is blocked; this sequence corresponds in
fact to a series of intermediate steps evolving the system from a stable state to another
stable state. Finally, we project away all the auxiliary relations introduced by the
translation mechanism, obtaining a filtered version of Υ_𝒮, which we denote as Υ_𝒮|_𝒢. The
intuition about the construction of these two transition systems is given in Figure 4. Notice
that the intermediate micro-steps in the two transition systems can be safely abstracted
away because: (i) thanks to the toggle-once principle, they do not contain any "internal"
cycle; (ii) respecting the firing order imposed by 𝒢, they all lead to the same next
stable/unblocked state. We can then establish the one-to-one correspondence between
these two transition systems in the following theorem (refer to [19] for the complete proof):

Theorem 4. Given a GSM model Q and its translation into a corresponding DCDS S,
the corresponding B-step transition system $\Upsilon_Q$ and filtered unblocked-state transition
system $\Upsilon_S|_Q$ are equivalent, i.e., $\Upsilon_Q = \Upsilon_S|_Q$.
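
The abstraction of intermediate states used in this construction can be sketched on a small, explicitly given graph. The function name `collapse_blocked`, the dictionary representation of transitions and the `is_blocked` predicate are assumptions made for illustration; the actual transition systems are in general infinite-state, and the sketch further assumes that every B-step passes through at least one blocked state.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

State = Hashable


def collapse_blocked(
    edges: Dict[State, Set[State]],
    is_blocked: Callable[[State], bool],
    initial: State,
) -> Set[Tuple[State, State]]:
    """Connect two unblocked states when a path through blocked states links them."""
    abstract: Set[Tuple[State, State]] = set()
    seen = {initial}
    frontier: List[State] = [initial]
    while frontier:
        src = frontier.pop()
        # Start from the blocked successors of an unblocked state ...
        stack = [t for t in edges.get(src, set()) if is_blocked(t)]
        visited = set(stack)
        while stack:
            mid = stack.pop()
            for nxt in edges.get(mid, set()):
                if is_blocked(nxt):
                    # ... keep walking while the instance is still blocked ...
                    if nxt not in visited:
                        visited.add(nxt)
                        stack.append(nxt)
                else:
                    # ... and record one abstract transition per unblocked target.
                    abstract.add((src, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
    return abstract


# Two micro-step interleavings (via b1 or b2) reach the same stable state s1.
low_level = {
    "s0": {"b1", "b2"},
    "b1": {"b3"},
    "b2": {"b3"},
    "b3": {"s1"},
    "s1": {"b4"},
    "b4": {"s0"},
}
collapsed = collapse_blocked(low_level, lambda s: str(s).startswith("b"), "s0")
```

In this toy graph the two branches through b1 and b2 collapse into a single transition from s0 to s1, mirroring observation (ii) above.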

4 State-bounded GSM models 

We now take advantage of the key decidability result given in Theorem 3, and study
verifiability of state-bounded GSM models. Observe that state-boundedness is not too
restrictive a condition. It requires each state of the transition system to contain a bounded
number of tuples. However, this does not mean that the system in general is restricted to
encounter only a limited amount of data: infinitely many values may be distributed across
the states (i.e., along an execution), provided that they do not accumulate in the same
state. Furthermore, infinitely many executions are supported, reflecting that whenever an
external event updates a slot of the information system maintained by a GSM artifact,
infinitely many successor states in principle exist, each one corresponding to a specific
new value for that slot. To exploit this, we first have to show that the GSM-DCDS
translation preserves state-boundedness, which is in fact the case.

Lemma 1. Given a GSM model Q and its DCDS translation S, Q is state-bounded if 
and only if S is state-bounded. 

Proof. Recall that S contains some auxiliary relations, used to restrict the applicability
of CA-rules in order to enforce the execution assumptions of GSM: (i) the eligibility
tracking table $R_{exec}$, (ii) the artifact instance blocking flags $R_{block}$, (iii) the internal
message pools for incoming data and outgoing messages, and (iv) the tables of status changes.
(⇐) This is directly obtained by observing that, if $\Upsilon_S$ is state-bounded, then also $\Upsilon_S|_Q$
is state-bounded. From Theorem 4 we know that $\Upsilon_S|_Q = \Upsilon_Q$, and therefore $\Upsilon_Q$ is
state-bounded as well.
(⇒) We have to show that state-boundedness of Q implies that all auxiliary relations
present in $\Upsilon_S$ are bounded as well. We discuss each auxiliary relation separately. The artifact
blocking relation $R_{block}$ keeps a boolean flag for each artifact instance, so its cardinality
depends on the number of instances in the model. Since the model is state-bounded, the
number of artifact instances is bounded and so is $R_{block}$. The eligibility tracking table
$R_{exec}$ stores for each artifact instance a boolean vector describing the applicability of
a certain PAC rule. Since the number of instances is bounded and so is the set of PAC
rules, the relation $R_{exec}$ is also bounded. Similarly, one can show the boundedness
of the status-change tables, due to the fact that the number of stages and milestones is fixed a priori.
Let us now analyze the internal message pools. By construction, S may contain at most one
tuple in each of the incoming data pools for each artifact instance. This is enforced by the blocking
mechanism $R_{block}$, which blocks the artifact instance at the beginning of a B-step
and prevents the instance from injecting further events in the internal pools. The outgoing
message pool may contain as many tuples per artifact instance as there are
atomic stages in the model, which is still a bounded number. However, neither incoming nor
outgoing messages are accumulated in the internal pools along the execution of B-steps,
since the final micro-step of the B-step is designed not to propagate any of the internal
message pools to the next snapshot. Therefore, $\Upsilon_S$ is state-bounded.

□ 

From the combination of Theorems 3 and 4 and Lemma 1 we directly obtain:

Theorem 5. Verification of $\mu\mathcal{L}_P$ properties over state-bounded GSM models is decidable,
and can be reduced to finite-state model checking of propositional $\mu$-calculus.

Obviously, in order to guarantee verifiability of a given GSM model, we need to understand
whether it is state-bounded or not. However, state-boundedness is a "semantic"
condition, which is undecidable to check [15]. We mitigate this problem by isolating a
class of GSM models that is guaranteed to be state-bounded. We show however that even
very simple GSM models (such as the one in Fig. 1) are not state-bounded, and thus we provide
some modelling strategies to make any GSM model state-bounded.



GSM Models without Artifact Creation. We investigate the case of GSM models
that do not contain any create-artifact-instance tasks. Without loss of generality, we
assimilate the creation of nested datatypes (such as those created by the "add item" task
in Example 1) to the creation of new artifacts. From the formal point of view, we can in
fact consider each nested datatype as a simple artifact with an empty lifecycle, and its
own information model including a connection to its parent artifact.

Corollary 1. Verification of $\mu\mathcal{L}_P$ properties over GSM models without
create-artifact-instance tasks is decidable.

Proof. Let Q be a GSM model without create-artifact-instance tasks. At each stable
snapshot, Q can either process an event representing an incoming one-way message,
or the termination of a task. We claim that the only possible source of state-unboundedness is
the return of service calls related to the termination of create-artifact-instance tasks.
In fact, one-way incoming messages, as well as other service call returns, do not increase
the size of the data stored in the GSM information model, because the payload of such
messages just substitutes the values of the corresponding data attributes, according to the
signature of the message. Similarly, by an inspection of the proof of Lemma 1, we know
that across the micro-steps of a B-step, status attributes are modified but their size does
not change. Furthermore, a bounded number of outgoing events could be accumulated
in the message pools, but this information is then flushed at the end of the B-step, thus
bringing the size of the overall information model back to the size it had at the
beginning of the B-step. Therefore, without create-artifact-instance tasks, the size of
the information model in each stable state is constant, and corresponds to the size of the
initial information model. We can then apply Theorem 5 to get the result. □

Fig. 5: Unbounded execution of the GSM model in Fig. 1

Arbitrary GSM Models. The types of models studied in the paragraph above are quite
restrictive, because they forbid the possibility of extending the number of artifacts during
the execution of the system. On the other hand, as soon as this is allowed, even very
simple GSM models, such as the one shown in Fig. 1, may become state-unbounded. In
that example, the source of state-unboundedness lies in the stage containing the "add
item" task, which could be triggered an unbounded number of times due to continuous
itemRequest incoming events, as pointed out in Fig. 5. This, in turn, is caused by the
fact that the modeler left the GSM model underspecified, without providing any hint
about the maximum number of items that can be included in an order. To overcome this
issue, we require the modeler to supply such information (stating, e.g., that each order
is associated with at most 10 items). Technically, the GSM model under study has to be
parameterized by an arbitrary but finite number $N_{max}$, which denotes the maximum
number of artifact instances that can coexist in the same execution state. We call this
kind of GSM model instance-bounded. A possible policy to provide such a bound is to
allocate available "slots" for each artifact type of the model, i.e., to specify a maximum
number $N_{A_i}$ for each artifact type $A_i$, then having $N_{max} = \sum_i N_{A_i}$. In order to
incorporate the artifact bounds into the execution semantics, we proceed as follows.
First, we pre-populate the initial snapshot of the considered GSM instance with $N_{max}$
blank artifact instances (respecting the relative proportion given by the local maximum
numbers for each artifact type). We refer to one such blank artifact instance as an artifact
container. Along the system execution, each container may be: (i) filled with concrete
data carried by an actual artifact instance of the corresponding type, or (ii) flushed to
the initial, blank state. To this end, each artifact container is equipped with an auxiliary
flag fn, which reflects its current state: fn is false when the container stores a concrete
artifact instance, true otherwise. Then, the internal semantics of create-artifact-instance
is changed so as to check the availability of a blank artifact container. In particular, when
the corresponding service call is to be invoked with the new artifact instance data, the
calling artifact instance selects the next available blank artifact container, sets its flag fn
to false, and fills it with the payload of the service call. If all containers are occupied,
the calling artifact instance waits until some container is released. Symmetrically to
artifact creation, the deletion procedure for an artifact instance is managed by turning
the corresponding container flag fn to true. Details on the DCDS CA-rules formalizing
creation/deletion of artifact instances according to these principles can be found in [19].
We observe that, following this container-based realization strategy, the information
model of an instance-bounded GSM model has a fixed size, which polynomially
depends on the total maximum number $N_{max}$. The new implementation of
create-artifact-instance does not really change the size of the information model, but just
suitably changes its content. Therefore, Corollary 1 directly applies to instance-bounded
GSM models, guaranteeing decidability of their verification. Finally, notice that infinitely
many different artifact instances can be created and manipulated, provided that they do
not accumulate in the same state (exceeding $N_{max}$).
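
The container-based strategy can be pictured with the following minimal sketch. It is an illustration under stated assumptions rather than the CA-rule formalization referred to above: the class names `Container` and `ContainerPool`, the field `blank` (playing the role of the container flag) and the choice of signalling "wait" by returning `None` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Container:
    artifact_type: str
    blank: bool = True                 # True when no concrete instance is stored
    data: Optional[dict] = None


class ContainerPool:
    def __init__(self, slots: Dict[str, int]) -> None:
        # Pre-populate one blank container per available slot of each type.
        self.containers: List[Container] = [
            Container(t) for t, n in slots.items() for _ in range(n)
        ]

    @property
    def n_max(self) -> int:
        # Total bound: the sum of the per-type bounds.
        return len(self.containers)

    def create(self, artifact_type: str, payload: dict) -> Optional[Container]:
        """Fill the next blank container of the given type, if any."""
        for c in self.containers:
            if c.artifact_type == artifact_type and c.blank:
                c.blank = False
                c.data = dict(payload)
                return c
        return None                    # assumption: the caller has to wait and retry

    def delete(self, container: Container) -> None:
        """Flush a container back to its blank state."""
        container.blank = True
        container.data = None


pool = ContainerPool({"Order": 1, "Item": 2})
first = pool.create("Item", {"name": "pen"})
second = pool.create("Item", {"name": "ink"})
third = pool.create("Item", {"name": "pad"})      # no blank Item container left
pool.delete(first)
fourth = pool.create("Item", {"name": "pad"})     # reuses the released container
```

Creation and deletion only rewrite the content of existing containers, so the number of containers stays equal to the bound throughout, while arbitrarily many distinct items can pass through the pool over time.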



5 Discussion and related work 

In this work we have provided the foundations for the formal verification of the GSM 
artifact-centric paradigm. After having proven undecidability of verification in the 
general case, we have shown decidability of verification for a very rich first-order 
temporal logic, tailored to the artifact-centric setting, for an interesting class of "state- 
bounded" GSM models. 

So far, only a few works have investigated verification of GSM models. The closest
approach to ours is [6], where state-boundedness is also used as a key property towards
decidability. The main difference between the two approaches is that decidability of
verification for state-bounded GSM models is proven for temporal logics of incomparable expressive power.
Differently from [6], in this work we also study modeling strategies to make an arbitrary
GSM model state-bounded, whereas [6] assumes that the input model is guaranteed to
be state-bounded. Hence, our strategies could be instrumental to [6] as well. In [14],
another promising technique for the formal verification of GSM models is presented.
However, the current implementation cannot be applied to general GSM models, because
of assumptions over the data types and the fact that only one instance per artifact type is
supported. Furthermore, a propositional branching-time logic is used for verification,
restricting to the status attributes of the artifacts. The results presented in our paper can
be used to generalize this approach towards more complex models (such as
instance-bounded GSM models) and more expressive logics, given, e.g., the fact that "one-instance
artifacts" fall inside the decidable cases we discussed in this paper.

It is worth noting that all the presented decidability results are actually even stronger:
they state that verification can be reduced to standard model checking of propositional
$\mu$-calculus over finite-state transition systems (thanks to the abstraction techniques studied
in [3]). This opens the possibility of actually implementing the discussed techniques,
by relying on state-of-the-art model checkers. We also inherit from [15] the complexity
bounds: they state that verification is ExpTime in the size of the GSM model which,
in the case of instance-bounded GSM models, means in turn ExpTime in the maximum
number of artifact instances that can coexist in the same state.

Besides implementation-related issues, we also aim to reassess the results presented
here in a setting where GSM relies on a rich knowledge base (a description logic
ontology) for its information model, in the spirit of [4].